fix the issue of GEMM validation failure by zhangnju · Pull Request #378 · ROCm/rocm-examples

zhangnju · 2025-12-16T15:06:04Z

Motivation

When running the GEMM sample on an R9700, validation fails if I change A_rows, A_cols, and B_cols from their default values to 4096.

Technical Details

Set the input matrix values to random numbers, as the other GEMM applications do.
Add a CPU GEMM validation function to check the GPU output. To speed up the GEMM on the CPU, I chose a blocked (tiled) GEMM.
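
The PR's actual validation function is not shown in this conversation. As a minimal sketch of what a blocked CPU GEMM can look like, assuming row-major storage and a hypothetical helper name `cpu_gemm_blocked` with an arbitrary tile size of 64:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the PR's code): C = A * B for row-major matrices,
// where A is M x K, B is K x N and C is M x N. The loops are tiled so that
// each tile of A, B and C is reused while it is still likely to be in cache.
std::vector<float> cpu_gemm_blocked(const std::vector<float>& A,
                                    const std::vector<float>& B,
                                    std::size_t M, std::size_t K, std::size_t N,
                                    std::size_t tile = 64) // tile size is an assumption
{
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t i0 = 0; i0 < M; i0 += tile)
        for (std::size_t k0 = 0; k0 < K; k0 += tile)
            for (std::size_t j0 = 0; j0 < N; j0 += tile)
            {
                // Clamp tile bounds so sizes that are not multiples of `tile` work.
                const std::size_t i1 = std::min(i0 + tile, M);
                const std::size_t k1 = std::min(k0 + tile, K);
                const std::size_t j1 = std::min(j0 + tile, N);
                for (std::size_t i = i0; i < i1; ++i)
                    for (std::size_t k = k0; k < k1; ++k)
                    {
                        const float a = A[i * K + k];
                        // Innermost loop walks contiguous rows of B and C.
                        for (std::size_t j = j0; j < j1; ++j)
                            C[i * N + j] += a * B[k * N + j];
                    }
            }
    return C;
}
```

Calling `cpu_gemm_blocked(A, B, A_rows, A_cols, B_cols)` would produce the host reference that the GPU output is compared against. The i-k-j loop order inside each tile is one common choice because it keeps the innermost accesses contiguous.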

Test Plan

Run with the default values; validation passes:
./hip_matrix_multiplication
Matrix multiplication: [2048x1024] * [1024x1024], block size: 16x16
Validation passed.
Set the input sizes to 4096; validation still passes:
./hip_matrix_multiplication --A_rows 4096 --A_cols 4096 --B_cols 4096
Matrix multiplication: [4096x4096] * [4096x4096], block size: 16x16
Validation passed.
Run with another size; validation still passes:
./hip_matrix_multiplication --A_rows 4096 --A_cols 512 --B_cols 2048
Matrix multiplication: [4096x512] * [512x2048], block size: 16x16
Validation passed.

Test Result

All tests pass.

Added/Updated documentation?

[ ] Yes
[x] No, does not apply to this PR.

Included Visual Studio files?

[ ] Yes
[x] No, does not apply to this PR.

Submission Checklist

[ ] New examples contain proper CMake and Make files.
[ ] I have updated relevant CI workflows and the new examples are being built and tested in CI.
[ ] Check and guard against unsupported ASICs if relevant dependent libraries have limited support.
[x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

zichguan-amd · 2025-12-16T21:26:47Z

cc @neon60 @j-stephan @adeljo-amd I wonder if this example overlaps with matrix multiplication from #375? If they are similar enough, we should probably just keep one.

zhangnju · 2025-12-17T15:05:31Z

I think these two examples use different kernels, so they should not be redundant.

zichguan-amd

I'm OK with the change. The CPU and GPU errors should be in the same range, and we can definitely use double for more precision. I'll let others weigh in.
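
To illustrate the point about precision, here is a minimal sketch of a host reference that accumulates in double and a tolerance check against the float GPU result. The names `reference_gemm_double` and `within_tolerance` and the default tolerance of 1e-4 are assumptions for illustration, not code from the PR:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference: row-major C = A * B, accumulated in double so that
// the reference carries much less rounding error than the float GPU result.
std::vector<double> reference_gemm_double(const std::vector<float>& A,
                                          const std::vector<float>& B,
                                          std::size_t M, std::size_t K, std::size_t N)
{
    std::vector<double> C(M * N, 0.0);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t k = 0; k < K; ++k)
        {
            const double a = A[i * K + k];
            for (std::size_t j = 0; j < N; ++j)
                C[i * N + j] += a * static_cast<double>(B[k * N + j]);
        }
    return C;
}

// Hypothetical check: every element must match within a relative tolerance.
// The scale is floored at 1.0 so values near zero are compared absolutely.
bool within_tolerance(const std::vector<float>& gpu,
                      const std::vector<double>& ref,
                      double rel_tol = 1e-4) // tolerance value is an assumption
{
    if (gpu.size() != ref.size())
        return false;
    for (std::size_t i = 0; i < gpu.size(); ++i)
    {
        const double diff  = std::fabs(static_cast<double>(gpu[i]) - ref[i]);
        const double scale = std::max(1.0, std::fabs(ref[i]));
        if (!(diff <= rel_tol * scale)) // written this way so NaN also fails
            return false;
    }
    return true;
}
```

A relative tolerance scales with the magnitude of the result, which matters here because each output element is a sum over A_cols products and grows with the inner dimension.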

zichguan-amd · 2025-12-18T22:16:00Z

HIP-Basic/matrix_multiplication/main.hip

-    constexpr float b_value = 0.02F;
-    std::fill(B.begin(), B.end(), b_value);
+    // Set matrix elements to random value on the host.
+    for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(rand() / RAND_MAX );


Should be static_cast<double>(rand()) / RAND_MAX. static_cast<float>(rand() / RAND_MAX) would result in 0 most of the time because of integer division.

zhangnju · 2025-12-19

Thanks for the reminder. We can change it to "static_cast<float>(rand() / (RAND_MAX + 1.0f))", which generates a random float in [0,1).
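
The difference between the two expressions in this thread can be sketched with a fixed input instead of rand(), so the behavior is reproducible. The function names below are hypothetical and exist only for illustration:

```cpp
#include <cstdlib>

// Original form from the diff: both operands are int, so the division
// truncates to 0 for every r < RAND_MAX and yields 1 only when r == RAND_MAX.
inline float scale_int_division(int r)
{
    return static_cast<float>(r / RAND_MAX);
}

// Updated form from the reply: adding 1.0f promotes the divisor to float, so
// the division is done in floating point. It also avoids the signed overflow
// that RAND_MAX + 1 would cause where RAND_MAX equals INT_MAX.
inline float scale_float_division(int r)
{
    return static_cast<float>(r / (RAND_MAX + 1.0f));
}
```

Passing `rand()` as `r` reproduces the two expressions discussed above. One caveat: where RAND_MAX is 2^31 - 1, a float cannot represent values that close to 2^31 exactly, so inputs within a few dozen of RAND_MAX may round up and produce exactly 1.0f. That is rare and harmless for test data, but the upper bound is not strictly exclusive.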

zichguan-amd · 2025-12-18T22:16:12Z

HIP-Basic/matrix_multiplication/main.hip

-    std::fill(B.begin(), B.end(), b_value);
+    // Set matrix elements to random value on the host.
+    for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(rand() / RAND_MAX );
+    for (size_t i = 0; i < B.size(); ++i) B[i] = static_cast<float>(rand() / RAND_MAX );


Same as above

zhangnju · 2025-12-19

I have updated it to "static_cast<float>(rand() / (RAND_MAX + 1.0f))" and verified that it generates random float values in [0,1).

zichguan-amd · 2025-12-18T22:18:21Z

HIP-Basic/matrix_multiplication/main.hip

+#include <cstdlib>
 #include <cassert>
 #include <cstddef>
+#include <memory>


Is this necessary?

zhangnju · 2025-12-19

I have removed them in the new commit.

zhangnju requested a review from a team as a code owner December 16, 2025 15:06

zhangnju added 2 commits December 17, 2025 06:55

fix the issue of GEMM validation failure

a1a2c43

remove redundent vector head file

44dd3ca

zichguan-amd requested changes Dec 18, 2025


update

d77279b

zichguan-amd requested a review from neon60 December 22, 2025 22:41

zhangnju wants to merge 3 commits into ROCm:amd-staging from zhangnju:amd-staging
